Randomized Search Methods for Solving Markov Decision Processes and Global Optimization

Author

Jiaqiao Hu

Abstract

Title of dissertation: RANDOMIZED SEARCH METHODS FOR SOLVING MARKOV DECISION PROCESSES AND GLOBAL OPTIMIZATION
Jiaqiao Hu, Doctor of Philosophy, 2006
Dissertation directed by: Professor Steven I. Marcus, Department of Electrical and Computer Engineering; Professor Michael C. Fu, Department of Decision and Information Technology

Markov decision process (MDP) models provide a unified framework for modeling and describing sequential decision making problems that arise in engineering, economics, and computer science. However, when the underlying problem is modeled by MDPs, the size of the resulting MDP model typically grows exponentially with the size of the original problem, which makes practical solution of the MDP models intractable, especially for large problems. Moreover, for complex systems, it is often the case that some of the parameters of the MDP models cannot be obtained in a feasible way, and only simulation samples are available. In the first part of this thesis, we develop two sampling/simulation-based numerical algorithms to address the computational difficulties arising from these settings. The proposed algorithms differ somewhat in emphasis: one algorithm focuses on MDPs with large state spaces but relatively small action spaces, and emphasizes the efficient allocation of simulation samples to find good value function estimates, whereas the other algorithm targets problems with large action spaces but small state spaces, and invokes a population-based approach to avoid carrying out an optimization over the entire action space. We study the convergence properties of these algorithms and report on computational results to illustrate their performance.

The second part of this thesis is devoted to the development of a general framework called Model Reference Adaptive Search (MRAS) for solving global optimization problems. The method iteratively updates a parameterized probability distribution on the solution space, so that the sequence of candidate solutions generated from this distribution will converge asymptotically to the global optimum. We provide a particular instantiation of the framework and establish its convergence properties in both continuous and discrete domains. In addition, we explore the relationship between the recently proposed Cross-Entropy (CE) method and MRAS, and show that the model reference framework can also be used to describe the CE method and study its properties. Finally, we formally discuss the extension of the MRAS framework to stochastic optimization problems and carry out numerical experiments to investigate the performance of the method.
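The idea of iteratively updating a parameterized sampling distribution can be illustrated with a minimal Python sketch. This is not the MRAS algorithm from the dissertation, whose update rule is not given in the abstract. It assumes an independent Gaussian model per coordinate and a cross-entropy-style refit to the best samples, and every name, parameter value, and the test objective below is hypothetical.

```python
import math
import random


def model_based_search(objective, dim, iterations=60, samples=200,
                       elite_fraction=0.1, smoothing=0.7, seed=0):
    """Maximize `objective` by repeatedly refitting a Gaussian sampling model.

    Illustrative cross-entropy-style update (an assumption of this sketch),
    not the MRAS update rule from the dissertation.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim   # initial model: broad Gaussian centred at the origin
    std = [5.0] * dim
    n_elite = max(2, int(samples * elite_fraction))

    for _ in range(iterations):
        # 1. Generate candidate solutions from the current distribution.
        candidates = [[rng.gauss(mean[d], std[d]) for d in range(dim)]
                      for _ in range(samples)]
        # 2. Keep the best-performing candidates (the elite set).
        candidates.sort(key=objective, reverse=True)
        elite = candidates[:n_elite]
        # 3. Move the model parameters toward the elite statistics.
        for d in range(dim):
            values = [x[d] for x in elite]
            new_mean = sum(values) / n_elite
            new_var = sum((v - new_mean) ** 2 for v in values) / n_elite
            mean[d] = smoothing * new_mean + (1 - smoothing) * mean[d]
            std[d] = smoothing * math.sqrt(new_var) + (1 - smoothing) * std[d]
    return mean


def toy_objective(x):
    """Hypothetical test objective with its maximum at x = (2, 2)."""
    return -sum((xi - 2.0) ** 2 for xi in x)


best = model_based_search(toy_objective, dim=2)
print([round(b, 3) for b in best])
```

As the sampling standard deviation shrinks, the candidates concentrate around the best region found so far. The smoothing parameter slows that contraction, which in sketches of this kind is a common guard against collapsing onto a poor solution too early.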


Similar resources

AFRL-AFOSR-VA-TR-2017-0028 Dynamic Decision Making under Uncertainty and Partial Information

The researchers made significant progress in all of the proposed research areas. The first major task in the proposal involved duality in stochastic control and optimal stopping. In support of this task, the researchers developed new methods for efficiently solving optimal stopping problems of partially observable Markov processes and optimal stopping problems under jump-diffusion processes. Th...


A Modified Discreet Particle Swarm Optimization for a Multi-level Emergency Supplies Distribution Network

Currently, research on emergency supplies distribution and decision models mostly focuses on deterministic models and exact algorithms. Only a few studies have been done on multi-level distribution networks and matheuristic algorithms. In this paper, random process theory is adopted to establish an emergency supplies distribution and decision model for a multi-level network. By analyzing the char...


(YIP) Dynamic Decision Making Under Uncertainty and Partial Information


Producing efficient error-bounded solutions for transition independent decentralized MDPs

There has been substantial progress on algorithms for single-agent sequential decision making using partially observable Markov decision processes (POMDPs). A number of efficient algorithms for solving POMDPs share two desirable properties: error-bounds and fast convergence rates. Despite significant efforts, no algorithms for solving decentralized POMDPs benefit from these properties, leading ...


Final Performance Report Grant FA

The researchers made significant progress in all of the proposed research areas. The first major task in the proposal involved simulation-based and sampling methods for global optimization. In support of this task, we have discovered two new innovative approaches to simulation-based global optimization; the first involves connections between stochastic approximation and our model reference appr...


Publication date: 2006
